Analysis of Workstation Disk Array High Availability Modes

Steve Watt / Network Server Division

In order to increase capacity and enhance performance in today's
workstation environment, a disk storage system requires multiple disk
mechanisms. However, when numerous disk mechanisms are configured into
a storage system, the combined Mean Time Between Failures (MTBF)
decreases, making users more vulnerable to downtime and data loss due
to a disk mechanism failure. To address this concern in critical
environments, disk arrays offer protection from disk mechanism failures
with several high availability modes. With each of the high
availability modes, if one of the disk mechanisms fails, user data
remains accessible and the array can reconstruct the missing
information once the failed mechanism is replaced.

HP WORKSTATION DISK ARRAY HIGH AVAILABILITY MODES:

RAID 0/1, Mirroring
-------------------
RAID 0/1 offers data protection through disk mirroring, in which a disk
array with six mechanisms uses three mechanisms for data storage and
three for mirrored data. This operating mode provides the highest level
of data protection because each disk has a mirrored copy.

RAID 3, Byte Striping with Parity
---------------------------------
RAID 3 configurations utilize five disk mechanisms. User data is
striped on a byte basis across four of the disks, and the fifth disk is
used to store parity information. The disk array creates the parity on
the fifth disk by using an "exclusive-or" encoding scheme.

RAID 5, Block Striping with Parity
----------------------------------
RAID 5 also utilizes five disk mechanisms and the "exclusive-or" parity
scheme. However, with RAID 5, parity information is spread across all
five disks. As with RAID 3, RAID 5 makes 80% of the total storage
capacity available for user data.

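The "exclusive-or" parity used by RAID 3 and RAID 5 can be illustrated
with a minimal sketch. It is not the array firmware: the function name
xor_parity and the four sample data blocks are hypothetical, chosen
only to show that XOR-ing the surviving blocks with the parity block
recovers the contents of a single failed mechanism.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of a list of equally sized byte blocks."""
    return bytes(
        reduce(lambda a, b: a ^ b, column)  # XOR one byte position across all blocks
        for column in zip(*blocks)
    )


# Four hypothetical data blocks, one per data disk (assumed sample values).
data = [b"DISK", b"ARRY", b"HIGH", b"AVAL"]

# The parity block that would be stored on the fifth disk.
parity = xor_parity(data)

# Simulate the loss of the third data disk and rebuild it from the rest.
survivors = data[:2] + data[3:]
rebuilt = xor_parity(survivors + [parity])

print(rebuilt == data[2])
```

The same identity holds whichever single block is lost, including the
parity block itself, which is why one failed mechanism out of five does
not make user data unavailable.
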
High availability modes enable customers to connect large capacities
(up to 228 Gbytes) to their S/700 workstation or server while reducing
the risk of downtime and data loss due to a disk mechanism failure. In
fact, when configured in a high availability mode, HP workstation disk
arrays provide FOUR to FIVE TIMES the protection from downtime, and
even greater protection from data loss, when compared to an equal
number of independent HP disk drives.

MTBF / MTBDNA CALCULATIONS:

Component                      MTBF (hrs)
---------                      ----------
5.25" Mechanisms               150,000 per mechanism
3.5" Mechanisms                150,000 per mechanism
Power Supply                   1,000,000
Fan                            300,000
Controller                     300,000
Misc. (Cable, chassis, etc.)   5,000,000

The following calculations were used to determine the magnitude of
protection from downtime. Since disk arrays in a high availability mode
provide access to data despite a disk mechanism failure, the measure of
Mean Time Between Data Not Available (MTBDNA) is more relevant to users
than Mean Time Between Failures (MTBF). When a disk array is not in a
high availability mode, MTBDNA for the array equals its MTBF. The
MTBDNA calculations assume a 24 hour restoration time (replacement and
rebuild of data) for a failed disk mechanism.

Model 420SA (3.5" disk mechanisms, Qty 5; 1 cabinet)
-----------------------------------------------------
MTBF = 1 / [5/150,000 + 1/1,000,000 + 1/300,000 + 1/300,000
            + 1/5,000,000]

MTBF = 24,271 hrs
       ----------

RAID 3 or 5:

MTBDNA = 1 / [(4*5*24)/(150,000*150,000) + 1/1,000,000 + 1/300,000
              + 1/300,000 + 1/5,000,000]

MTBDNA = 126,775 hrs
         -----------

Model 420SA (3.5" disk mechanisms, Qty 6; 1 cabinet)
-----------------------------------------------------
MTBF = 1 / [6/150,000 + 1/1,000,000 + 1/300,000 + 1/300,000
            + 1/5,000,000]

MTBF = 20,891 hrs
       ----------

RAID 0/1:

MTBDNA = 1 / [(3*2*24)/(150,000*150,000) + 1/1,000,000 + 1/300,000
              + 1/300,000 + 1/5,000,000]

MTBDNA = 127,015 hrs
         -----------

Model 1350SA (5.25" disk mechanisms, Qty 5; 2 cabinets)
--------------------------------------------------------
MTBF = 1 / [5/150,000 + 2/1,000,000 + 2/300,000 + 1/300,000
            + 1/5,000,000]

MTBF = 21,962 hrs
       ----------

RAID 3 or 5:

MTBDNA = 1 / [(4*5*24)/(150,000*150,000) + 2/1,000,000 + 2/300,000
              + 1/300,000 + 1/5,000,000]

MTBDNA = 81,824 hrs
         ----------

Model 1350SA (5.25" disk mechanisms, Qty 6; 2 cabinets)
--------------------------------------------------------
MTBF = 1 / [6/150,000 + 2/1,000,000 + 2/300,000 + 1/300,000
            + 1/5,000,000]

MTBF = 19,157 hrs
       ----------

RAID 0/1:

MTBDNA = 1 / [(3*2*24)/(150,000*150,000) + 2/1,000,000 + 2/300,000
              + 1/300,000 + 1/5,000,000]

MTBDNA = 81,924 hrs
         ----------

The calculations show that in a high availability mode, the chance of
downtime due to a disk mechanism failure is nearly eliminated. Downtime
becomes a function of the failure rates of the power supply, fan,
controller and miscellaneous components. Since disk mechanism failures
typically account for 75-80% of all disk storage system failures for
configurations this size, eliminating them leaves only 20-25% of the
original failures, so the high availability modes increase protection
from downtime by FOUR to FIVE times!

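The figures above can be reproduced with a short sketch. The function
and constant names are illustrative, not from the original analysis.
Two points are a reading of the formulas rather than something stated
in the text: the doubled terms for the two-cabinet model are taken to
be one power supply and one fan per cabinet, and the disk term is taken
to be the rate of a second failure that defeats the redundancy during
the 24 hour restoration window (any of the 4 other disks for RAID 3 or
5, only the mirror partner for RAID 0/1, which gives 6*1*24 = 3*2*24).

```python
DISK_MTBF = 150_000.0           # hours, per mechanism
POWER_SUPPLY_MTBF = 1_000_000.0
FAN_MTBF = 300_000.0
CONTROLLER_MTBF = 300_000.0
MISC_MTBF = 5_000_000.0         # cable, chassis, etc.
RESTORE_HOURS = 24.0            # replacement and rebuild time


def shared_rate(cabinets):
    """Failure rate of the non-disk components (assumed: one power
    supply and one fan per cabinet, one controller, one misc. term)."""
    return (cabinets / POWER_SUPPLY_MTBF + cabinets / FAN_MTBF
            + 1 / CONTROLLER_MTBF + 1 / MISC_MTBF)


def mtbf(disks, cabinets):
    """Any single component failure counts as a failure."""
    return 1.0 / (disks / DISK_MTBF + shared_rate(cabinets))


def mtbdna(disks, partners, cabinets):
    """Data is unavailable only if one of `partners` critical disks
    fails while an already failed disk is being restored."""
    double_failure_rate = disks * partners * RESTORE_HOURS / DISK_MTBF ** 2
    return 1.0 / (double_failure_rate + shared_rate(cabinets))


# (disks, partners, cabinets) for each configuration in the text.
configs = {
    "420SA  RAID 3 or 5": (5, 4, 1),
    "420SA  RAID 0/1": (6, 1, 1),
    "1350SA RAID 3 or 5": (5, 4, 2),
    "1350SA RAID 0/1": (6, 1, 2),
}

for name, (disks, partners, cabinets) in configs.items():
    f = mtbf(disks, cabinets)
    d = mtbdna(disks, partners, cabinets)
    print(f"{name}: MTBF {f:,.0f} hrs, MTBDNA {d:,.0f} hrs, ratio {d / f:.2f}")
```

Under these assumptions the computed values agree with the figures
above to within one hour of rounding. The MTBDNA to MTBF ratio for the
four configurations ranges from roughly 3.7 to 6.1.
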
PROTECTION FROM DATA LOSS:

In addition to increasing the availability of data, disk arrays also
provide protection from data loss caused by a failure. It is difficult
to estimate the magnitude of the protection from data loss provided by
an array. However, since the most probable type of failure to cause
data loss is a disk mechanism failure, it follows that a high
availability mode will offer at least FOUR to FIVE times the protection
from data loss. If a system has the ability to identify and avoid data
loss in the event of a power supply, fan or controller failure, the
increased protection from data loss with a disk array in a high
availability mode will be even greater.

SUMMARY OF WORKSTATION DISK ARRAY HIGH AVAILABILITY MODES

RAID               DATA STORAGE   TOTAL STORAGE   MTBF      MTBDNA
LEVEL    MODEL     (GB)           (GB)            (hrs)     (hrs)
------   -----     ------------   -------------   ------    -------
0/1      1350SA*   4.05           8.10            19,157     81,924
0/1      420SA*    1.27           2.53            20,891    127,015
3 or 5   1350SA    5.40           6.75            21,962     81,824
3 or 5   420SA     1.69           2.11            24,271    126,775

* - customer must purchase a 6th disk mechanism for RAID 0/1